Abstract - Key to the appropriate use of data is knowledge of
data quality. This knowledge is critical for products and decision-support
tools that utilize real-time data, and it is equally essential for the
longer-term application of data. Guidance by the National
Archives  and  Records  Administration  (NARA)  for  appraising 
observational  data  for  archive  states  that  factors  favoring  long-term 
or  permanent  retention  include  the  uniqueness,  completeness,  and 
quality  of  observational  data  and  the  quality  and  completeness  of 
metadata  [1].  The  National  Oceanographic  Data  Center  (NODC), 
the  designated  archive  center  for  oceanographic  data  in  the  U.S., 
requires  that  data  submitted  be  documented  to  enable  secondary  use 
and  ensure  data  posterity.  Such  metadata  should  include  not  only 
geospatial  characteristics  and  time  periods  of  observations,  but  also 
the  collection  methods,  instrumentation  used,  units  of  measure, 
acceptable  values,  error  tolerance,  processing  history,  quality 
assessments  and  explanations  of  quality  flags,  data  aggregation 
methods,  and  other  pertinent  information  [2].  Providing  this 
information  in  a  consistent  manner  can  be  a  challenge.  However,  an 
approach  to  capturing  and  conveying  this  metadata  using 
community-developed  practices  for  ocean  observing  system  data  and 
metadata  is  well  underway. 

This  paper  presents  methods  of  capturing  data  and  provenance 
of  data  quality  using  the  Open  Geospatial  Consortium  (OGC) 
Sensor Web Enablement (SWE) framework. It describes the types of
metadata content captured and demonstrates the utility and
significance  of  defining  and  registering  terms  to  enable  semantic,  as 
well  as  syntactic,  interoperability.  The  SWE  framework  provides  an 
avenue  for  conveying  quality  flags  and  methods  used  to  make 
assurances  about  the  integrity  of  oceanographic  data  for  real-time 
consumption  and  for  potential  submittal  to  permanent  archives  such 
as  NODC. 

I.  Introduction 

A  “grassroots”  activity,  called  QARTOD  (Quality 
Assurance  of  Real-Time  Oceanographic  Data),  funded 
primarily  by  the  National  Oceanic  and  Atmospheric 
Administration  (NOAA),  has  brought  together  data  managers, 
scientists  and  sensor  manufacturers  from  government  and 
private  industry  to  determine  minimum  requirements  in 
quality  assurance  and  quality  control  (QA/QC)  for  real-time 
oceanographic  data.  To  date,  four  QARTOD  workshops  have 
focused  on  waves,  in  situ  currents,  conductivity/ 
temperature/depth  (CTD),  and  dissolved  oxygen  (DO)  data. 

The  OGC  is  a  standards  organization  that  is  leading  the 
development  of  publicly  available,  consensus-based  standards 
to “geo-enable the Web.” The suite of OGC standards that
comprise  the  SWE  framework,  specifically,  the  Sensor 
Observation  Service  (SOS),  which  enables  retrieval  of  data 
and  metadata  from  sensors  and  sensor  systems,  and  Sensor 
Modeling  Language  (SensorML),  which  is  specifically 
designed  to  describe  how  observable  properties  (such  as 
pressure)  are  transformed  into  an  observation  (such  as  wave 
height),  seemed  a  perfect  match  for  oceanographic  sensor 
networks. 

Q2O, short for QARTOD to OGC, is a project funded by
NOAA to implement QA/QC standards for in situ ocean
sensors using the OGC SWE framework. This project brings
the OGC SWE developers and information technology (IT)
specialists together with oceanographers and data managers to
develop specifications, data dictionaries, and SWE profiles for
the application of QARTOD-identified QC tests and the
capture of QA information.

As  data  are  moved  along  the  path  from  the  sensor  (point  of 
origin)  to  the  data  provider  and  on  to  aggregation  centers,  data 
archives  or  consumers,  knowledge  of  data  provenance, 
characteristics  of  the  data  source,  system  configurations,  and 
corrections  to  the  data  itself  must  be  maintained.  Use  of  the 
OGC  standards  provides  the  ability  to  track  such  information 
in  a  manner  that  not  only  accompanies  the  real-time  data  but 
also  can  provide  persistent  reference  for  the  long-term. 
Applying  a  common  framework  to  communicate  the  history  of 
a  sensor,  data  processing  and  results  can  create  a  shared 
understanding  of  the  data  and  aid  in  enabling  the  machine-to- 
machine  integration  of  these  data. 

II.  QA/QC  and  Metadata  Considerations 

The  series  of  QARTOD  workshops  addressed  determining 
standards  for  QA/QC  and  metadata  for  real-time  ocean  data. 
Taking  the  community  approach  to  this  work  helps  ensure  that 
the  practices  identified  are  those  accepted  by  and  applied  by 
the  data  collectors.  Each  QARTOD  group  (waves,  in  situ 
currents,  CTD,  and  DO)  is  at  a  different  stage  of  completion. 
The  waves  and  in  situ  currents  groups  are  farthest  along, 
having  identified  QC  tests  and  QA  and  metadata  needed  to 
support the real-time observations. Tables 1 and 2 provide a
sampling  of  the  QARTOD-recommended  QC  tests  for  waves 
and  in  situ  currents,  respectively.  These  examples  provide  not 
only  the  test,  but  also  a  recommended  action  or  flagging  to  be 
used  based  on  the  test  results  for  given  criteria. 

In  addition  to  the  QC  tests,  each  group  identified  QA  best 
practices  and  related  information.  With  the  understanding  that 
the  most  appropriate  procedures  for  a  given  technology  be 
applied,  documenting  or  logging  instrument  pre-release, 
deployment  and  post-recovery  activities  was  emphasized  by 
all  groups.  Also  important  is  recording  events,  such  as 
instrument  servicing  (changing  batteries,  cleaning  faces, 
replacing  membranes,  etc.)  or  notable  environmental  factors 
(e.g.,  biofouling,  meteorological  events).  Maintenance  and 
storage  of  sensors,  when  not  deployed,  also  contribute  to  the 
sensor  histories  and  can  play  a  role  in  evaluating  sensor 
performance. 


QA/QC and sensor selection discussions included reference to
manufacturer specifications and recommended operational
environments.  Many  of  these  sensor  characteristics  also 
contributed  to  the  metadata  content  recommendations.  Such 
sensor  information  as  sampling  rates  and  durations;  firmware 
versions;  and  calibration  dates,  methods,  and  coefficients  were 
combined  with  station,  platform  and  deployment  characteristics 
to  make  up  an  extensive  list  of  potential  metadata  [4].  Although 
no  final  requirements  were  laid  out  for  metadata  content, 
commonalities  were  identified  among  the  QARTOD  groups  on 
the  types  of  metadata  necessary.  Both  the  minimum  information 
that  needs  to  be  transmitted  with  the  data  and  the  complete  record 
containing  all  information  to  document  and  enable  users  to 
understand  the  quality  and  appropriateness  of  the  data  still  need  to 
be  refined  by  the  community. 


TABLE 1

Example of QARTOD-recommended tests for waves [3]

SPECTRAL VALUES

| Category | Criteria | | | Action |
|---|---|---|---|---|
| NON-DIRECTIONAL: | | | | |
| Operational frequency range test | Defined by the environment and instrument | 1 | 1. Soft; 2. Hard | 1. Max/min user defined. 2. Instrument spec exceeded, reject. |
| DIRECTIONAL: | | | | |
| Incident low frequency energy direction | Location defined | 1 | Soft | User defined |
| Check factors, ratio | Should be approximately = 1; check over time. Location dependent | 1 | Soft | User defined |

PARAMETER VALUES

| Category | Criteria | | | Action |
|---|---|---|---|---|
| Wave parameters max/min acceptable range (Height, Period, Direction, Directional Spread) | Location dependent | 1 | 1. Soft; 2. Hard | User defined. 1. Flag values outside expected limits. 2. Reject entire record if H exceeds gross limit; otherwise reject individual parameter. |
| Time continuity | Short range history (applied to H) | 2 | Soft | User defined |

TABLE 2

Example of QARTOD-recommended tests for in situ currents using Teledyne RD Instruments ADCP [4]

| Test | Pass | Suspect | Fail |
|---|---|---|---|
| BIT status (Built In Test) - diagnostics | BIT result is zero | BIT result is non-zero | BIT result is NA |
| Echo amplitude/intensity | Value between 70 and 220 counts | Values between 60 and 70 counts | Values greater than 220 counts; values less than 60 counts |
| Pitch/Roll (absolute value) | 0-15 deg | 15-20 deg | >20 deg |
| UV - Horizontal velocity | The velocity magnitude is less than or equal to 220 cm/s for OO 38 (BB and NB), OS 38, 75 and 150 (BB and NB), WH Long Ranger 75 (BB and NB) and WH QuarterMaster 150 (BB and NB), WH 300, 600 and 1200 kHz (BB and NB) | The velocity magnitude is greater than 220 cm/s and less than or equal to 300 cm/s for OO 38 (BB and NB), OS 38, 75 and 150 (BB and NB), WH Long Ranger 75 (BB and NB) and WH QuarterMaster 150 (BB and NB), WH 300, 600 and 1200 kHz (BB and NB) | The velocity magnitude is greater than 300 cm/s for OO 38 (BB and NB), OS 38, 75 and 150 (BB and NB), WH Long Ranger 75 (BB and NB) and WH QuarterMaster 150 (BB and NB), WH 300, 600 and 1200 kHz (BB and NB) |
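The pass/suspect/fail thresholds in Table 2 translate directly into simple classification functions. The following Python sketch is illustrative only: the function and flag names are hypothetical, NA is represented as `None`, and the handling of values that fall exactly on a shared boundary (for example, 70 counts or 15 deg) is an assumption, since the table does not say which side such values belong to.

```python
from enum import Enum
from typing import Optional


class Flag(Enum):
    PASS = "pass"
    SUSPECT = "suspect"
    FAIL = "fail"


def bit_flag(bit_result: Optional[int]) -> Flag:
    """BIT status: zero passes, non-zero is suspect, NA (None here) fails."""
    if bit_result is None:
        return Flag.FAIL
    return Flag.PASS if bit_result == 0 else Flag.SUSPECT


def echo_amplitude_flag(counts: float) -> Flag:
    """Echo amplitude/intensity test using the Table 2 thresholds (counts)."""
    if counts < 60 or counts > 220:
        return Flag.FAIL
    if counts < 70:
        # Assumption: exactly 70 counts is treated as a pass.
        return Flag.SUSPECT
    return Flag.PASS


def pitch_roll_flag(angle_deg: float) -> Flag:
    """Pitch/roll test applied to the absolute value of the angle (degrees)."""
    magnitude = abs(angle_deg)
    if magnitude > 20:
        return Flag.FAIL
    if magnitude > 15:
        # Assumption: exactly 15 degrees is treated as a pass.
        return Flag.SUSPECT
    return Flag.PASS


def horizontal_velocity_flag(speed_cm_s: float) -> Flag:
    """Horizontal velocity magnitude test (cm/s) for the listed ADCP models."""
    if speed_cm_s > 300:
        return Flag.FAIL
    if speed_cm_s > 220:
        return Flag.SUSPECT
    return Flag.PASS
```

Returning a three-valued flag rather than a boolean keeps the "suspect" category visible to downstream consumers, which matches the table's distinction between data to reject and data to review.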

III.  Applying  Sensor  Web  Enablement 
A.  Describing  the  System 

The  Martha’s  Vineyard  Coastal  Observatory  (MVCO; 
http://www.whoi.edu/mvco),  owned  and  operated  by  the 
Woods  Hole  Oceanographic  Institution  (WHOI),  provided 
the  testbed  for  the  demonstration  of  the  first  part  of  the 
project.  The  MVCO  is  comprised  of  a  shore  station,  a 
meteorological  mast,  a  12-m  node,  and  an  air-sea 
interaction  tower  (Fig.  1).  Each  of  these  components  can 
include  a  number  of  instruments  and  sensors.  In  describing 
the  waves  measurements  from  MVCO,  the  system’s 
components  are  characterized  using  a  number  of  SensorML 
files.  SensorML  was  selected  because  it  is  specifically 
designed  to  describe  systems  and  configurations  of 
systems,  as  well  as  the  processes  by  which  measured 
properties  are  transformed  into  observations. 


Martha's Vineyard Coastal Observatory

• Shore Lab
• Meteorological Mast
    - Solent Model R3: Wind Speed, Wind Direction
    - Vaisala PTU: Air Temperature, Relative Humidity, Air Pressure
    - Eppley Model PSP (Precision Spectral Pyranometer): Solar Radiation
    - Eppley Model PIR (Precision Infrared Radiometer or Pyrgeometer): Infrared Radiation
• 12-m Offshore Node
    - TRDI ADCP: Wave Height, Wave Period, Wave Direction, Wave Spectra, Water Temperature, Current Speed, Current Direction
    - ParoScientific Pressure Gauge: Tide
    - Seabird Sensor: Salinity
• Air-Sea Interaction Tower

Figure 1. MVCO Components


The  modular  approach  used  in  describing  the  MVCO 
allows  for  independent  descriptions  of  the  components  that 
can  be  linked  and  reused  by  various  configurations.  For  the 
Acoustic  Doppler  Current  Profiler  (ADCP)  waves 
measurements,  SensorML  files  are  developed  for  the 
observatory,  the  12-m  node,  and  the  ADCP  sensor,  which 
includes  both  a  manufacturer/model-level  file  and  an 
instance/serial  number-level  file.  These  files  are  listed  below 
with  a  brief  description  of  some  of  the  key  contents. 

•  MVCO:  Owner  and  operator  contact  information. 
List  of  and  links  to  four  major  components  of  the 
observatory. 

•  12_m_node:  Position  information  and  coordinate 
reference  system.  List  of  and  links  to  instrumentation 
associated  with  12-m  node. 

•  RDI_Workhorse_1200:  A  general  SensorML 
description  of  the  Teledyne  RDI  Workhorse  Model 
1200.  Technical  specifications  and  system 
characteristics  for  this  model.  Manufacturer  and 
technical point of contact information. Note (1):
This points to references from the manufacturer and
can be used by anyone who is using a Teledyne
RDI Workhorse 1200. Note (2): Work on the
development of the Workhorse 1200 SensorML is
still ongoing.

•  MVCO_Workhorse_1200:  A  SensorML  description 
entailing  details  about  the  specific  MVCO  instance  of 
a  Teledyne  RDI  Workhorse  Model  1200  and  the 
ProcessModels  that  operate  on  individual  data  points. 
It  describes  the  setup  at  the  MVCO  and  specifies 
particulars,  such  as  sampling  frequency,  reporting 
frequency,  and  burst  length.  It  also  refers  to 
operational  points  of  contact  and  time-stamped 
service  events  that  occur  which  may  affect  the  quality 
of  the  observation  (e.g.,  a  failed  pressure  port  and  its 
replacement,  a  cleaned  ADCP  face). 

B.  Describing  the  Data  Processing 

Tracking  the  quality  of  the  data  is  aided  by  the  ability  to 
describe  the  workflow  processes  for  the  measurements  and  the 
QC  procedures  applied  by  the  different  data  providers.  To 
facilitate  this  capability,  SensorML  was  employed.  In  SensorML, 
all  components  are  modeled  as  processes.  The  building  blocks  of 
the  SensorML  descriptions  are  ProcessChain,  ProcessModel, 
System  and  Component.  ProcessChain  and  ProcessModel  refer  to 
nonphysical  composite  and  atomic  processes,  respectively. 
Component  refers  to  an  atomic  sensor  while  System  refers  to  a 
collection  of  Components  such  as  a  system  of  sensors  (e.g.,  a 
CTD). 

The  SensorML  files  are  used  for  describing  the  data 
processing  and  include  ProcessChains  that  string  together  the 
individual  components  and  ProcessModels.  For  the  MVCO,  that 
top-level  SensorML  file  links  sensor  and  lineage  descriptions,  the 
process  components,  and  the  input  and  output  of  each  process 
step. Fig. 2 depicts a flow diagram of the ADCP_System and
shows how the QC tests are incorporated into this data model
and  description  within  SensorML.  Each  part  is  represented  by 
its  own  SensorML  file.  The  following  material  lists  the 
SensorML  documents  used  to  describe  the  MVCO  ADCP 
System,  including  the  processes  and  general  QC  tests. 

•  ADCP_System  -  Main  SensorML  description  that  pulls 
together  processes,  tests,  and  the  system  components, 
RDI_Workhorse_1200  and  MVCO_Workhorse_1200. 

Process  modules  include: 

•  Pressure_QC_Chain  -  General  ProcessChain  for 
Pressure  time  series  data. 

•  Velocity_QC_Chain  -  General  ProcessChain  for 
Velocity  time  series  data. 

• Pressure_QC_Chain_Values - ProcessChain for Pressure
time series data with parameters configured for MVCO
setup.

•  Velocity_QC_Chain_Values  -  ProcessChain  for 
Velocity  time  series  data  with  parameters  configured  for 
MVCO  setup. 

•  Pressure_Obs_Process  -  Chain  that  generates  a  number 
of  observable  properties,  such  as  wave  height  and 
period,  from  the  cleaned,  interpolated  time  series  that  is 
output  from  Pressure_QC_Chain. 

•  Velocity_Obs_Process  -  Chain  that  generates  a  number 
of  observable  directional  wave  properties  from  the 
cleaned,  interpolated  time  series  that  is  output  from 
Velocity_QC_Chain. 

•  TimeSeriesChain  -  ProcessChain  composed  of  several 
individual  processes  that  perform  time-related  QC 
checks  on  the  Pressure  and  Velocity  Series  data. 

QC  test  modules  include: 

•  DataGapTest  -  ProcessChain  composed  of  several 
individual  processes  that  perform  time-related  QC 
checks  on  the  Pressure  and  Velocity  Series  data. 


Figure  2.  Each  ProcessChain  documents  input  into  the  system,  a  description 
of  the  ProcessModels,  including  QC  tests  and  Components,  and  its  output.  [5] 


•  RangeSeriesTest  -  Test  to  determine  if  a  data  point 
lies  between  an  upper  and  lower  bound.  Operates  on 
a  data  series. 

• RangeTest - The atomic Process for range checking
a single point. This is used in several places in the
ADCP Q2O framework.

•  MinThresholdSeriesTest  -  Like  RangeTest,  but  only 
operates  on  a  lower  bound.  Operates  on  a  data  series. 

•  MinThresholdTest  -  The  atomic  Process  for  testing  if 
a  data  value  exceeds  a  lower  bound. 

•  SpikeTest  -  ProcessChain  for  SpikeTest. 
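The relationship between the atomic tests and their series counterparts can be sketched in a few lines of Python. This is an illustration of the logic described in the list above, not the SensorML encoding itself; the function names are hypothetical, and the use of strict (exclusive) bounds is an assumption.

```python
from typing import List, Sequence


def range_test(value: float, lower: float, upper: float) -> bool:
    """Atomic check: True when a single point lies between the two bounds.

    Assumption: the bounds are exclusive.
    """
    return lower < value < upper


def range_series_test(series: Sequence[float], lower: float, upper: float) -> List[bool]:
    """Series form: applies the atomic range check to every point."""
    return [range_test(value, lower, upper) for value in series]


def min_threshold_test(value: float, lower: float) -> bool:
    """Atomic check: True when a single value exceeds the lower bound."""
    return value > lower


def min_threshold_series_test(series: Sequence[float], lower: float) -> List[bool]:
    """Series form: applies the atomic threshold check to every point."""
    return [min_threshold_test(value, lower) for value in series]
```

Building the series tests on top of the atomic ones mirrors the reuse described above, where a single atomic Process appears in several places in the framework.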

Each  ProcessChain  encapsulates  one  or  more  elements 
which  can  be  either  tests  or  other  chains.  The  ProcessChain 
describes  the  data  flow  via  inputs,  outputs,  and  parameters.  A 
series  of  connections  serves  to  describe  the  linkage  between  these 
elements.  Several  of  the  tests  and  processes  use  specific 
parameters.  These  criteria  for  evaluating  the  data,  such  as  a 
maximum  value  for  wave  height,  can  be  included  inline  or 
declared  externally  and  coupled  to  the  appropriate  process  by 
SensorML-aware  software.  The  MVCO  SOS  serves  these 
parameters  as  another  SOS  offering,  so  that  the  parameter 
values  used  for  any  time  in  the  archive  can  be  retrieved  at  a  later 
date. 
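One way to picture externally declared parameters that remain retrievable for any time in the archive is a time-indexed parameter history coupled to a chain of tests. The sketch below is a simplified Python analogy under stated assumptions: the class and function names, the parameter keys, and the dates and limits are all hypothetical and are not taken from the MVCO SOS.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Callable, Dict, List, Tuple

# A test takes a data series and a parameter mapping and returns per-point results.
QCTest = Callable[[List[float], Dict[str, float]], List[bool]]


def range_series(series: List[float], params: Dict[str, float]) -> List[bool]:
    """Range check using externally supplied 'minimum' and 'maximum' parameters."""
    return [params["minimum"] < value < params["maximum"] for value in series]


class ParameterHistory:
    """Parameter sets keyed by the time at which they took effect."""

    def __init__(self, entries: List[Tuple[datetime, Dict[str, float]]]):
        self._entries = sorted(entries, key=lambda entry: entry[0])
        self._times = [time for time, _ in self._entries]

    def at(self, when: datetime) -> Dict[str, float]:
        """Return the parameter set that was in effect at the given time."""
        index = bisect_right(self._times, when) - 1
        if index < 0:
            raise LookupError("no parameters in effect at that time")
        return self._entries[index][1]


def run_chain(series: List[float], tests: Dict[str, QCTest],
              params: Dict[str, float]) -> Dict[str, List[bool]]:
    """Apply each named test in the chain with the same parameter set."""
    return {name: test(series, params) for name, test in tests.items()}


# Hypothetical limits: the maximum was tightened on 1 June 2008.
history = ParameterHistory([
    (datetime(2008, 1, 1), {"minimum": 0.0, "maximum": 10.0}),
    (datetime(2008, 6, 1), {"minimum": 0.0, "maximum": 8.0}),
])
flags = run_chain([1.0, 9.0], {"range": range_series}, history.at(datetime(2008, 7, 1)))
```

Keeping the parameters outside the test definitions means the same general chain can be reused with different limits, and a result can later be interpreted against the limits that applied when it was produced.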

The  flexibility  of  the  SWE  framework,  as  seen  in  the 
MVCO  ADCP  System,  can  support  all  the  elements  of  how 
the  data  were  collected  and  what  was  done  to  it  as  well  as  the 
complexity  and  provenance  of  tests  applied.  While  this 
framework  is  excellent  for  the  real-time  data  use,  it  also 
provides  information  required  for  its  secondary  use  or  reuse, 
as  well  as  the  submission  to  archives. 

C.  Developing  Vocabularies 

Terms  referred  to  in  the  SensorML,  from  the  input  observables 
to  the  resulting  test  flags,  should  reference  a  meaningful, 
resolvable  definition.  Wherever  possible,  existing  vocabularies 
can  be  referenced;  however,  for  this  work  with  the  QARTOD 
quality  tests  and  processes,  registered  vocabularies  did  not  exist. 
During a Q2O workshop in June 2008, discussions on vocabulary
development resulted in an approach for content requirements
summarized in Table 3.

In developing the Q2O and MVCO vocabularies, an attempt
was made for each term to include the same components. The
Q2O team compiled vocabulary terms for the QARTOD-recommended
tests, input parameters, QC flags and bibliographic
references. The MVCO vocabulary required additional categories
to capture the processing (process chains) applied to data, as well
as the outputs and measurement properties. The resulting
vocabularies compiled for Q2O and MVCO are registered in the
Marine  Metadata  Interoperability  (MMI)  Project’s  Ontology 
Registry  and  Repository  (http://mmisw.org/or).  This  registry 
provides  a  unique,  resolvable  Uniform  Resource  Locator  (URL) 
for  each  term,  and  the  categories  of  the  terms  are  carried  as  part 
of  this  URL.  In  addition,  vocabulary  terms  registered  with  MMI 
can  be  mapped  and  related  to  other  vocabularies  and  knowledge 
domains,  further  enabling  semantic  interoperability. 
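Because the category is carried in the term URL, software can recover it without a separate lookup. The Python sketch below assumes the path layout `ont/<vocabulary>/<category>/<term>`, which is inferred from the example URLs in this paper (such as http://mmisw.org/ont/q2o/test/rangeTest) rather than from a published registry specification; the type and function names are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class TermRef(NamedTuple):
    vocabulary: str
    category: str
    term: str


def parse_term_url(url: str) -> TermRef:
    """Split a registered term URL into vocabulary, category, and term name."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    # Assumed layout, inferred from the examples: ont/<vocabulary>/<category>/<term>
    if len(parts) != 4 or parts[0] != "ont":
        raise ValueError(f"unexpected term URL layout: {url}")
    return TermRef(*parts[1:])


ref = parse_term_url("http://mmisw.org/ont/q2o/test/rangeTest")
```

Here `ref.category` is `"test"` and `ref.term` is `"rangeTest"`, so a client could, for instance, group all referenced tests separately from parameters and flags.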


TABLE 3

Q2O Vocabulary Guidance

| Name | Definition | Example |
|---|---|---|
| Identifier * | unique expression | rangeTest (http://mmisw.org/ont/q2o/test/rangeTest) |
| Long name * | official name; human readable label; not necessarily the common name | Range Test |
| Short Name | descriptive or commonly referred to name or label; can be the same as long name | Range Test |
| Definition * | the formal statement of the meaning or significance of a term; note: multiple definitions must not be conflicting | The check to ensure that all measurements or values fall within established upper and lower limits. |
| Symbol | sign used to represent an element, quantity, quality, operation or relation (e.g., "Hs, Td") | n/a |
| Reference | source report, publication, document or other record; creating a reference list as part of the vocabulary allows for a resolvable link to a citation | Fourth Workshop on the Quality Assurance of Real-time Data, Final Report, QARTOD-IV Woods Hole Oceanographic Institute, Woods Hole, MA, June 21-23, 2006. (http://mmisw.org/ont/q2o/reference/q4) |
| Figure | a graphical representation or image that helps explain or is referenced in the definition; should be included as a persistent url if possible | n/a |
| Category | a grouping or classification of the terms; for Q2O this is a distinction among a test, a parameter, a flag or a reference. | Test (http://mmisw.org/ont/q2o/test/rangeTest) |
| Relationship | terms associated with the identified term (e.g., parameters that are inputs to tests or resultant flags that are outputs from tests); ontological links to other objects | http://mmisw.org/ont/q2o/parameter/minimum; http://mmisw.org/ont/q2o/parameter/maximum; http://mmisw.org/ont/q2o/parameter/flag |
| Equation | a symbolic representation showing the kind and/or amount of the starting inputs and products (outputs) of a process; could be included as a persistent url link to a document or image or a urn to a MathML file (e.g., xmlns="http://www.w3.org/1998/Math/MathML") | min < x and x < max |
| Note | explanatory comment or brief record | A more general name for all types of specific range checks. |

* determined to be a required element for a vocabulary
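The guidance in Table 3 can be read as a record layout with three required elements. As a rough illustration, the Python sketch below models a term as a dataclass and reports which required elements are empty; the class, field, and method names are hypothetical and do not reflect how the MMI registry stores terms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VocabularyTerm:
    # Required elements (marked * in Table 3).
    identifier: str
    long_name: str
    definition: str
    # Optional elements.
    short_name: Optional[str] = None
    symbol: Optional[str] = None
    reference: Optional[str] = None
    figure: Optional[str] = None
    category: Optional[str] = None
    relationships: List[str] = field(default_factory=list)
    equation: Optional[str] = None
    note: Optional[str] = None

    def missing_required(self) -> List[str]:
        """Return the names of required elements that are empty."""
        required = ("identifier", "long_name", "definition")
        return [name for name in required if not getattr(self, name).strip()]


range_test_term = VocabularyTerm(
    identifier="rangeTest",
    long_name="Range Test",
    definition=("The check to ensure that all measurements or values fall "
                "within established upper and lower limits."),
    short_name="Range Test",
    category="test",
    relationships=[
        "http://mmisw.org/ont/q2o/parameter/minimum",
        "http://mmisw.org/ont/q2o/parameter/maximum",
        "http://mmisw.org/ont/q2o/parameter/flag",
    ],
    note="A more general name for all types of specific range checks.",
)
```

A check such as `range_test_term.missing_required()` could be run before registration so that incomplete terms are caught early.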


For Q2O, we tried to limit the vocabulary development to
those terms that are unique or have distinct definitions related to
the QARTOD tests. Rather than defining characteristics of the
sensors or instruments, we have encouraged manufacturers to
register the terms that describe their products and processing.
This will allow operators to point to a common, authoritative
vocabulary for an instrument and not unnecessarily redefine
the terms.

Determining whether or not an existing vocabulary is
appropriate for use must be done with awareness of the full
meaning of the terms as they are defined. Misunderstanding and
data integration problems can occur if similar terms have
seemingly minor but distinct variations in detail. For example, the
registered Climate and Forecast (CF) definition of water pressure
specifies units of decibars, while the output of the MVCO
system is in cm, so an MVCO use of that CF term could lead to
uncertainty about a value’s unit of measure.

IV.  Accessing  the  Data  and  Metadata 

One  piece  of  the  SWE  framework  is  the  SOS.  Through  this 
web  service,  data  can  be  retrieved  from  sensors  and/or  sensor 
systems.  The  SOS  acts  as  an  intermediary  between  a  near-real 
time  sensor  channel  or  observation  repository  and  a  client. 
Along  with  the  data,  clients  can  use  SOS  to  obtain  metadata 
that  describes  the  sensors,  platforms,  and  processing  applied  to 
the  data. 

A.  SOS  Core  Operations 

Three  core  operations  are  mandatory  with  SOS: 
GetCapabilities,  DescribeSensor  and  GetObservation.  Access 
to  the  SOS  service  metadata  containing  information  about  the 
observation  offerings  (the  data  being  served)  is  through  the 
GetCapabilities  operation.  The  DescribeSensor  operation  retrieves 
detailed  information  about  the  sensors  and  processes  generating  the 
measurements  or  observations.  The  GetObservation  operation 
provides  access  to  the  sensor  observation  and  measurement  data 
itself.  The  combination  of  these  three  operations  provides  a 
comprehensive  characterization  of  a  data  set. 
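The three core operations can be illustrated as key-value-pair (KVP) request URLs. The Python sketch below is hedged throughout: the endpoint, procedure identifier, offering, and observed property are hypothetical placeholders, the version number and format strings are assumptions, and the parameter names follow common SOS KVP usage rather than anything specified in this paper. A given server may instead require XML requests sent by HTTP POST.

```python
from urllib.parse import urlencode

SOS_ENDPOINT = "http://example.org/sos"  # hypothetical endpoint, not the MVCO service


def sos_url(request: str, **params: str) -> str:
    """Build a KVP-style request URL for the named SOS operation."""
    query = {"service": "SOS", "request": request}
    query.update(params)
    return f"{SOS_ENDPOINT}?{urlencode(query)}"


# Service metadata, including the observation offerings.
capabilities_url = sos_url("GetCapabilities")

# Sensor and process description (identifiers below are placeholders).
sensor_url = sos_url(
    "DescribeSensor",
    version="1.0.0",
    procedure="urn:example:sensor:adcp",
    outputFormat='text/xml;subtype="sensorML/1.0.1"',
)

# Observation data for one offering and one observed property.
observation_url = sos_url(
    "GetObservation",
    version="1.0.0",
    offering="example_offering",
    observedProperty="example_property",
    responseFormat='text/xml;subtype="om/1.0.0"',
)
```

A client would typically call GetCapabilities first to discover the available offerings and procedures, then use those identifiers in the other two requests.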

B.  MVCO  Implementation 

The initial Q2O implementation of SOS returns responses
for real-time and archived wave data from the MVCO. The
DescribeSensor  operation  for  the  MVCO  includes  the 
observatory,  the  12-m  node  and  the  ADCP  characteristics, 
provenance  and  lineage  including  linked  SensorML  files  with 
QC  tests  and  parameters  used  in  processing.  The 
GetCapabilities  operation  provides  MVCO  ADCP  system 
metadata  and  notifications  for  the  six  possible  observation 
offerings  from  one  data  stream.  These  offerings  include 
options for only the data that have passed QC testing or all data
with the associated QC flags. The flags indicate which tests
the data either passed or failed. Access to these data offerings
is through the GetObservation operation.

C.  Resulting  Information  Returned 

Results  returned  from  the  SOS  operations  are  Extensible 
Markup  Language  (XML)  documents.  These  results  are 
generally  intended  for  machine  interpretation;  however,  some 
of  this  information  also  needs  to  be  “human  readable.”  SWE 
experts from the University of Alabama in Huntsville developed
a basic web application, called PrettyView, for displaying SensorML
files in a tabular form. This application is in a beta
state  (http://vast.uah.edu/SensorMLforms/upload.jsp),  but 
supports  most  SensorML  constructs  in  its  present  form. 
Using the PrettyView application with the results from the
MVCO SOS operations lets the user quickly navigate the
content of the XML documents and link QC test results to the
processes and parameters (test criteria) applied to the data.

The data returned are created as comma-separated value
(CSV) text wrapped in the content-rich SensorML files as part
of the GetObservation operation. Carrying the supporting
metadata  with  the  data  extends  the  value  of  long-term  data  sets 
by  enabling  providers  to  serve  well  documented  sensor  and 
processing  history  with  their  offerings. 
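Once the CSV text has been extracted from the surrounding XML, reading it and separating records that passed QC from the full flagged set is straightforward. The Python sketch below assumes the text block has already been pulled out of the response; the field names, flag values, and sample records are hypothetical and do not reproduce actual MVCO output.

```python
import csv
import io
from typing import Dict, List


def parse_block(block: str, fields: List[str]) -> List[Dict[str, str]]:
    """Turn a CSV text block into one dictionary per record."""
    reader = csv.reader(io.StringIO(block.strip()))
    return [dict(zip(fields, (cell.strip() for cell in row))) for row in reader if row]


def passed_only(records: List[Dict[str, str]], flag_field: str = "qc_flag",
                pass_value: str = "pass") -> List[Dict[str, str]]:
    """Keep only the records whose QC flag marks them as passed."""
    return [record for record in records if record[flag_field] == pass_value]


# Hypothetical field list and data block for illustration.
FIELDS = ["time", "wave_height_m", "qc_flag"]
BLOCK = """
2008-06-01T00:00:00Z,1.2,pass
2008-06-01T00:20:00Z,9.9,fail
"""

records = parse_block(BLOCK, FIELDS)
good_records = passed_only(records)
```

In practice the field names and their order would come from the metadata that accompanies the data block rather than from a hard-coded list.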

V.  Next  Steps 

Refinement of profiles for the different types of
observations and development of guidance for implementers
continue. Integration of these capabilities into the cookbooks of
the OOSTethys/OpenIOOS project (http://www.oostethys.org) is
planned.

Including  additional  metadata  content  in  SensorML, 
expanding  methods  that  can  transform  this  information  into 
other  required  metadata  standards,  and  providing  clients  that 
can  leverage  the  capabilities  of  the  Sensor  Web  are  all 
potential  areas  for  development. 

The Q2O work has focused on demonstrating the use of SWE
to enable the QARTOD QA/QC recommendations. The content
captured  in  these  SensorML  files  is  only  a  basic  step  in  the  use  of 
the  SWE  standards.  Richer  applications  of  SWE  for  the  sensors 
and  measurements  discussed  by  QARTOD  can  be  developed 
which  further  relate  sensors  to  co-located  or  duplicate  sensors, 
sensors  within  instrument  packages,  instruments  with  respect  to 
other  instruments  onboard  platforms,  or  platforms  as  part  of 
ocean  observatories. 

From  the  data  archive  perspective,  getting  the  information 
(metadata)  into  a  usable  form  from  the  SensorML  is  something 
that  needs  to  be  considered.  Automating  the  requests  for  data 
through  an  SOS  and  transforming  the  metadata  content  into  a 
standard  usable  by  both  an  archive  center  and  human-searchable, 
data-discovery  systems  would  be  beneficial.  Any  automation  of 
such  requests  can  be  incorporated  into  submission  agreements 
between  data  providers  and  archive  centers. 

VI.  Conclusions 

Ocean data climatologies that advise mariners of typical
conditions, offer engineers the probabilities of extreme
environmental conditions for designs, or are used by scientists
to examine trends and impacts on ecosystem health are built
from the long-term compilation of what were once real-time data.
Reuse or secondary use of data for climatology purposes or for
applications that extend well beyond the initial data collection
considerations requires an understanding of the data. This
means  that  the  lineage  of  the  data,  its  collection  and 
processing  method,  post-processing  corrections,  and  other 
history  must  be  readily  available.  Capturing  this  information 
from  the  start  of  the  collection  activities  is  key  to  the  data’s 
potential  use,  secondary-use  and  long-term  preservation. 
Using standards that are specifically designed for sensor
systems, observations, and measurements, such as the
components of OGC SWE, provides a means to both capture
and access the data and metadata. Combining these standards
with  community-accepted  data  quality  and  information 
management  practices  helps  ensure  data  posterity. 